Do not partition parameters smaller than this threshold. Smaller values use less memory, but can greatly increase communication (especially latency-bound messages) and slow down training and validation.
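
The trade-off can be illustrated with a rough sketch. Everything here is an assumption made for illustration: `plan_partitioning` is a hypothetical helper, not part of any library, sizes are element counts, and the cost model (one gather per partitioned parameter per pass) is deliberately simplified.

```python
def plan_partitioning(param_sizes, threshold, world_size):
    """Hypothetical helper: decide which parameters stay whole.

    param_sizes maps parameter name -> number of elements.
    Parameters smaller than `threshold` are kept whole on every rank;
    the rest are split evenly across `world_size` ranks.
    """
    kept = {n: s for n, s in param_sizes.items() if s < threshold}
    partitioned = {n: s for n, s in param_sizes.items() if s >= threshold}

    # Whole parameters are replicated; partitioned ones cost one shard each
    # (ceiling division, since shards are padded to equal size).
    per_rank_elements = sum(kept.values()) + sum(
        -(-s // world_size) for s in partitioned.values()
    )

    # Simplified assumption: each partitioned parameter needs one gather
    # message before it can be used in a forward or backward pass.
    gathers_per_pass = len(partitioned)
    return kept, partitioned, per_rank_elements, gathers_per_pass


# Illustrative (made-up) model: two tiny parameters and one large one.
sizes = {
    "layernorm.weight": 1024,
    "layernorm.bias": 1024,
    "linear.weight": 1_048_576,
}

for threshold in (0, 100_000):
    _, _, elements, gathers = plan_partitioning(sizes, threshold, world_size=8)
    print(f"threshold={threshold}: {elements} elements/rank, {gathers} gathers/pass")
```

With a threshold of 0 every parameter is partitioned, so each rank holds slightly fewer elements but must issue a gather even for the two tiny parameters. Those small messages are the latency-bound ones that a larger threshold avoids.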
